In this paper, we describe an IDE called CAPS (Calculational Assistant for Programming from Specifications)
for the interactive, calculational derivation of imperative programs. In building CAPS,
our aim has been to make the IDE accessible to non-experts while retaining the overall flavor of the 
pen-and-paper calculational style. We discuss the overall architecture of the CAPS system, the main 
features of the IDE, the GUI design, and the trade-offs involved. 


1 Introduction 

Correct by Construction is a programming methodology in which programs are derived from a given
formal specification of the problem to be solved, by repeatedly applying transformation rules to partially
derived programs. Within this broad framework, Dijkstra and Wim Feijen [?] popularized the calculational
style for deriving sequential programs, where unknown program fragments are calculated from
their pre- and postconditions. By calculation, we mean that program constructs are introduced only
when logical manipulations show them to be sufficient for discharging the correctness proof obligations.

Despite resulting in simple and elegant programs [?], the calculational style of program derivation
has not become popular, owing to practical difficulties that have prevented its wider adoption.
Even for small programming problems, the derivations are often long and difficult to
organize. As a result, derivations done manually are error-prone and cumbersome.

To address these issues, we have built an IDE called CAPS (Calculational Assistant for Programming
from Specifications). CAPS has built-in refinement rules and the system generates the required correctness
proof obligations. In building CAPS, our aim has been to make the IDE accessible to non-experts
while retaining the overall flavor of the pen-and-paper style derivation.

Towards this goal, in our earlier work, we described the use of theorem prover assisted tactics [?]
to automate the mundane tasks during the derivations. In this paper, we discuss the overall architecture 
of CAPS, the main features of the IDE, the GUI design, and the design trade-offs involved. For the 
automation to fit into the overall calculational methodology, we have developed several features, like 
stepping into subcomponents, backtracking, and metavariable support. With the help of small examples, 
we discuss how these features address various issues with particular emphasis on usability. 

Related Work. 

The Implement-and-Verify program development methodology involves an implementation phase followed
by a verification phase. Tools like Why3 [?], Dafny [20], VCC [?], and VeriFast [?] generate
the proof obligations and try to discharge them automatically. Although the failed
proof obligations provide some hint, there is no structured help available to the users in the actual task
of implementing the programs. Users often rely on ad-hoc use cases and informal reasoning to guess the
program constructs.

Systems like Cocktail [?], Refine [22], Refinement Calculator [9], and PRT [?] provide tool support
for refinement-based formal program derivation. Cocktail offers a proof editor for first-order logic
which is partially automated by a tableau-based theorem prover. However, the proof style is different
from the calculational style. Refine has a plug-in called Gabriel which allows users to create tactics
using a tactic language called ArcAngel. Refine and Gabriel are not integrated with theorem provers and
do not support discharging of proof obligations. In the case of Refinement Calculator and PRT, the program
constructs need to be encoded in the language of the underlying theorem prover. In CAPS, our goal
has been to be theorem-prover agnostic, so that we can exploit the advances made in different theorem
provers.

The KIDS and the Specware [?] systems provide operations for the transformational development
of programs and have been very successful in synthesizing efficient scheduling algorithms. However,
these systems are targeted towards expert users. Jape [?] is a proof calculator for interactive,
step-by-step construction of proofs in natural-deduction style. Although Jape supports Hoare logic, it is mainly
intended for proof construction, whereas CAPS is focused on program derivation and has many tactics
specific to program calculations.

2 An Example of a Calculational Derivation 

We now present a sketch of the calculational derivation for a simple program. Consider the following
programming task (adapted from exercise 4.3.4 in [?]; the informal derivation of this problem also
appears in [?]).

Let f[0..N) be an array of booleans where N is a natural number. Derive a program for the computation
of a boolean variable r such that r is true iff all the true values in the array come before all the false
values.

Fig. 1 depicts the derivation process for this program. We start the derivation by providing the formal
specification (node A) of the unknown program S. We apply the Replace Constant by a Variable [?]
heuristic. In particular, we replace the constant N by a fresh variable n and add bounds on n to arrive at
program B. After inspecting the postcondition of the program shown in node B, we decide to apply another
well-known heuristic, Take Conjuncts as Invariants, to arrive at a While program (node C) with P0 and P1 as
loop invariants. Here, S0 denotes the unknown loop body. (Derivation of the initialization of the variables
r and n is skipped.) To ensure loop progress, we envision an assignment r, n := r', n+1 for S0, where
r' is a placeholder for the unknown expression (also called a metavariable). We then step into the proof
obligation for preservation of invariant P0 and try to manipulate the formula with the aim of finding a
program expression for the metavariable r'. After several formula transformations, we arrive at the formula
E: r' ≡ (r ∧ ¬f[n]) ∨ (∀i : 0 ≤ i < n+1 : f[i]). At this point, we realize that we cannot represent r' in
terms of the program variables unless we introduce a fresh variable to maintain (∀i : 0 ≤ i < n : f[i]).
We then backtrack to program C, introduce a fresh variable s, and strengthen the invariant of the While
program with P2. For the derivation of program S1, we follow the same process as that of S0 with the
strengthened invariant. On this derivation attempt, we are able to calculate r' with the help of the newly
added invariant P2. Finally, we derive s := s ∧ f.n to establish P2(n := n+1).¹ The final derived program
is shown in node H. (Note that we can further improve the program by strengthening the guard.)


¹ P2(n := n+1) represents the formula obtained by textual substitution of the free occurrences of n with n+1 in P2.

A (program node):
    con N: int {N ≥ 0}; var f: array [0..N) of bool; var r: bool;
    S
    {R: r ≡ (∃p : 0 ≤ p ≤ N : (∀i : 0 ≤ i < p : f.i) ∧ (∀i : p ≤ i < N : ¬f.i))}

A to B: Replace N by n and add bounds

B (program node):
    con N: int {N ≥ 0}; var f: array [0..N) of bool; var r: bool;
    S
    {r ≡ (∃p : 0 ≤ p ≤ n : (∀i : 0 ≤ i < p : f.i) ∧ (∀i : p ≤ i < n : ¬f.i)) ∧ n = N ∧ 0 ≤ n ≤ N}

B to C: Take conjuncts P0 and P1 as invariant

C (program node):
    r, n := true, 0;
    {invariant: P0 ∧ P1}
    while n ≠ N →
        S0
    end

From C: Envision S0: r, n := r', n+1 and step into proof obligation of P0

(formula node):
    wp.(r, n := r', n+1).(P0)
    = {definition of P0 and assignment}
    r' ≡ (∃p : 0 ≤ p ≤ n+1 : (∀i : 0 ≤ i < p : f.i) ∧ (∀i : p ≤ i < n+1 : ¬f.i))
    = {split off p = n+1; 0 ≤ n+1}

Formula Transformations

E (formula node):
    r' ≡ (r ∧ ¬f.n) ∨ (∀i : 0 ≤ i < n+1 : f.i)

C to D: Strengthen Inv. with P2

D (program node):
    r, n, s := true, 0, true;
    {invariant: P0 ∧ P1 ∧ P2}
    while n ≠ N →
        S1
    end

Calculate r'

G (formula node):
    r' ≡ (r ∧ ¬f.n) ∨ s

H (program node):
    r, n, s := true, 0, true;
    {invariant: P0 ∧ P1 ∧ P2}
    while n ≠ N →
        s := s ∧ f.n;
        {P2(n := n+1)}
        r, n := (r ∧ ¬f.n) ∨ s, n+1
    end

Figure 1. Sketch of the calculational derivation for a simple program. Symbols S, S0, and S1 are the placeholders
for the unknown program fragments. The single bordered boxes represent program nodes whereas the double
bordered boxes represent formula nodes.

P0: r ≡ (∃p : 0 ≤ p ≤ n : (∀i : 0 ≤ i < p : f[i]) ∧ (∀i : p ≤ i < n : ¬f[i]))

P1: 0 ≤ n ≤ N;  P2: s ≡ (∀i : 0 ≤ i < n : f[i])

As can be seen in this example, the calculational derivation involves program transformations as well 
as formula transformations. The derivation process is non-linear involving backtracking and branching. 


3 CAPS 

In building CAPS, our aim has been to build an easy-to-use IDE for the calculational derivation of
imperative programs. We have tried to automate the mundane tasks while striving to keep the overall
approach close to the pen-and-paper calculational style. The publicly available IDEs each lack one
or more of the features important for our purpose (for example, structured calculations,
integration with multiple theorem provers, backtracking and branching).


3.1 Derivation Methodology 

We use a hierarchical representation called AnnotatedProgram for representing a program fragment along
with its specification (precondition and postcondition). The AnnotatedProgram representation can be
thought of as an extension of the Guarded Command Language (GCL) [?] where each program construct
in the GCL is augmented with its precondition and postcondition. We also introduce a new program
construct UnkProg to represent an unsynthesized program fragment. Each subprogram in the annotated
program representation has its own precondition and postcondition. As we will see in section 5, this
hierarchical structure is helpful when the user wants to focus on each subprogram independently.
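
To make the shape of this representation concrete, here is a minimal sketch using Python dataclasses. It is not the CAPS data structure (the Core library is written in Scala); the class and field names other than UnkProg are assumptions made for illustration, and formulas are kept as plain strings.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class UnkProg:
    """An unsynthesized fragment, known only by its specification."""
    pre: str
    post: str


@dataclass
class Assignment:
    pre: str
    post: str
    targets: List[str]
    exprs: List[str]


@dataclass
class While:
    pre: str
    post: str
    invariant: str
    guard: str
    body: "Program"


@dataclass
class Composition:
    pre: str
    post: str
    parts: List["Program"]


Program = Union[UnkProg, Assignment, While, Composition]


def unknowns(p):
    """Collect the fragments that still have to be derived."""
    if isinstance(p, UnkProg):
        return [p]
    if isinstance(p, While):
        return unknowns(p.body)
    if isinstance(p, Composition):
        return [u for part in p.parts for u in unknowns(part)]
    return []


# Node C of Fig. 1: the loop body S0 is still unknown.
node_c = Composition(
    pre="N >= 0", post="R",
    parts=[
        Assignment("N >= 0", "P0 and P1", ["r", "n"], ["true", "0"]),
        While("P0 and P1", "P0 and P1 and n == N",
              invariant="P0 and P1", guard="n != N",
              body=UnkProg(pre="P0 and P1 and n != N", post="P0 and P1")),
    ],
)
```

Because every node carries its own precondition and postcondition, `unknowns(node_c)` returns the single fragment S0 together with everything needed to derive it in isolation.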

We use formulas in sorted first-order predicate logic for expressing the precondition and the
postcondition of programs. We use the Eindhoven notation [?] for expressing quantified formulas.
In the quantified formula (OP i : R : T), the symbol OP is the quantifier version of a symmetric and
associative binary operator op, i is a list of quantified variables, R is the Range (a boolean expression
typically involving the quantified variables), and T is the Term (an expression).
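
The reading of (OP i : R : T) can be illustrated by evaluating it over a finite set of candidate values. The sketch below is only an executable paraphrase of the notation; the function `quantify` and its explicit `domain` and `unit` parameters are assumptions of the sketch, not part of CAPS.

```python
import operator
from functools import reduce


def quantify(op, unit, domain, rng, term):
    """Evaluate (OP i : R : T) over a finite domain.

    op     -- symmetric, associative binary operator
    unit   -- identity of op, the value for an empty range
    domain -- finite iterable of candidate values for i (an assumption of this sketch)
    rng    -- the Range R, a predicate on i
    term   -- the Term T, an expression in i
    """
    return reduce(op, (term(i) for i in domain if rng(i)), unit)


f = [True, True, False]
n = 2

# (∀ i : 0 ≤ i < n : f[i])
all_true = quantify(operator.and_, True, range(len(f)),
                    lambda i: 0 <= i < n, lambda i: f[i])

# (Σ i : 0 ≤ i < 4 : i * i)
sum_sq = quantify(operator.add, 0, range(10),
                  lambda i: 0 <= i < 4, lambda i: i * i)
```

An empty range yields the unit of the operator, which is why a universal quantification over an empty range is true.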

Users start a derivation by providing the formal specification of a program and then incrementally 
transform it into a fully derived program by applying predefined transformation rules called Derivation 
Tactics. For example, in Fig. 1, the user starts the derivation by providing the postcondition R (node
A). This program is then transformed incrementally to the final program shown in node H. During the 
derivation, a user might envision a subprogram in terms of the metavariables. The next task for the user 
is to find a program expression for the metavariable such that the proof obligation is discharged. This 
requires formula transformations to simplify the proof obligation. The derivation thus consists of the 
program transformations as well as the formula transformations. These derivation modes are called the 
program mode and the formula mode respectively. A way of transitioning between these two modes is 
described in section 5. The derivation process ends when all the unknown programs are derived. The 
complete derivation history is recorded in the form of the Derivation Tree. 

The final outcome of the program derivation process is the fully annotated program along with the
complete derivation tree. The AnnotatedProgram can be easily transformed to a program in a real
programming language.


3.2 Graphical User Interface 

Fig. 2 shows the Graphical User Interface of the CAPS system. It has three panels. The central panel,
also called the contents panel, shows a partially derived program (or a formula) at the current stage of the
derivation. For example, the schematic node C in Fig. 1 corresponds to the program in the contents panel
in Fig. 2. The contents in this panel can be shown at different levels of detail, as discussed in section
6. The left panel, also called the tactics panel, shows the list of the tactics applied so far. It corresponds
to a path in the derivation tree. For example, the tactics applied from node A to node C in Fig. 1 are listed
in the tactics panel in Fig. 2. Users can navigate back to an earlier point in the derivation by clicking
on the corresponding node in the left panel. The bottom panel is the input panel. This panel is used for
selecting a tactic to be applied next and for providing the corresponding tactic parameters.

Figure 2. CAPS GUI


3.3 System Architecture 

The architecture of the CAPS system is shown in Fig. 3. There are 3 main components of the system: 

• Core Library. The Core library contains the data structures for AnnotatedPrograms, Formula,
DerivationTree, DerivationTactic and Frame. It also contains a repository of the program and
the formula manipulation tactics. The Core library is integrated with various automated theorem
provers (Alt-Ergo, CVC3, SPASS, Z3) via the common interface provided by the Why3 framework [?].
The Derivation Tree management utilities are also implemented in this library. The
library is implemented in Scala and uses the Kiama library [?] for rewriting.

• Application Server. The server component is implemented using the Scala Play web framework [?].
The server stores the current state of the derivation. The application also implements a tactic
parser which parses the tactic request.

• Web Client. The CAPS application is implemented as a single-page web application based on the
Backbone.js framework [?]. The client also maintains the state of the derivation, so that purely
navigational actions do not require a round trip to the server; this makes the application more responsive.
The GUI part is implemented in the TypeScript language [?] (which compiles to JavaScript). The GUI
module has different views to display the current state of the derivation.

Figure 3. CAPS Architecture


4 Textual vs Structured Representation 

One important decision in developing an IDE is the choice between a textual representation and a
structured one. While tools like Dafny [20] and Why3 [?] use textual representations, a structured
representation is more suitable for a tactic-based framework like CAPS. An Annotated Program in CAPS
has a hierarchical structure consisting of nested programs and formulas. By structured representation,
we mean that such hierarchical elements are identifiable in the GUI. As discussed later, this allows the
user to select and focus on a subprogram or a subformula. Note that doing the same in a text-based
representation would require extra processing [?].

Direct editing of the Annotated Program may destroy the structure and is disallowed in CAPS; the 
only way to generate a program is through a tactic application. This discipline allows us to capture all the 
design decisions taken during the derivation. However, to allow some informality, we do have tactics to 
directly guess a program fragment (or the next formula). In such cases, the role of a tactic application is 
just to ensure - with the help of theorem provers - that the transformation is correct, and that the structure 
is maintained. 

Figure 4. Structured representation of a formula in normal mode and selection mode. Users can select a
subformula by simply clicking on it. 





Figure 5. Input Panel: On selection of a tactic to be applied, the corresponding input form is dynamically 
generated. 


The contents panel in Fig. 2 shows the structured representation of an annotated program. Fig. 4 
shows the structured representation of a formula in the normal and the selection mode. The binary 
logical operators are shown using the infix notation. Only necessary parentheses are displayed assuming 
the usual precedence. We put more space around the lower precedence operators (like =) to improve 
readability. 
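
The display rules in the paragraph above (infix operators, only the necessary parentheses, wider spacing around looser operators) can be sketched as a small pretty-printer. This is an illustrative sketch, not the CAPS renderer: the precedence table, the tuple encoding of formulas, and the function name `show` are assumptions.

```python
# Binding strength of each operator; a higher number binds tighter.
# This table is an assumption of the sketch, not the one used by CAPS.
PRECEDENCE = {"≡": 1, "⇒": 2, "∨": 3, "∧": 4, "¬": 5}


def show(formula, parent=0):
    """Render a formula given as an atom (str) or a tuple (op, arg) / (op, left, right)."""
    if isinstance(formula, str):
        return formula
    op, *args = formula
    level = PRECEDENCE[op]
    if len(args) == 1:
        text = op + show(args[0], level)
    else:
        pad = " " if level >= 3 else "  "    # more space around looser operators
        left = show(args[0], level)
        right = show(args[1], level + 1)     # conservative: keep parentheses on the right
        text = left + pad + op + pad + right
    # Parenthesize only when this operator binds looser than its context requires.
    return "(" + text + ")" if level < parent else text


example = ("≡", "r'", ("∨", ("∧", "r", ("¬", "f[n]")), "s"))
rendered = show(example)   # "r'  ≡  r ∧ ¬f[n] ∨ s"
```

The right operand is rendered one level stricter than the left, so a nested occurrence of the same operator on the right keeps its parentheses; this is deliberately conservative because not every operator in the table is associative.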

For inputting the tactic parameters, we prefer a dynamically generated GUI instead of a static textual
input form. On selecting a tactic to be applied next, the corresponding input form is dynamically
generated. Users need not remember the input parameters required for the tactic. Fig. 5 shows the tactic input
panel for the Init4 tactic which is used for specifying the program. Since CAPS is a web-based
application, the hypertext-based display enables providing a help menu for input parameters in a user-friendly
way.

For entering formulas, however, we prefer textual input. The formulas are entered in the LaTeX
format. The formula input box is responsive; as soon as a LaTeX expression is typed, it converts the
expression into the corresponding symbol immediately. 

Figure 6. Formula transformations from the derivation of the Binary Search program.


5 Focusing on subcomponents 

During the program derivation process, an annotated program is nothing but a partially derived program 
containing multiple unsynthesized subprograms. The derivation of these unsynthesized subprograms is, 
for the most part, independent of the rest of the program. Hence the CAPS system provides a facility 
to extract all the contextual information required for the derivation of a subprogram so that the user 
can focus their attention on the derivation of one of these unknown subprograms. A subprogram can 
be selected by simply clicking on it. On selecting a subprogram, only the extracted context of the 
subprogram, and its precondition and postcondition are shown whereas the rest of the program is hidden. 

Similar to the subprogram extraction, users can choose to restrict attention to a subformula of the
formula under consideration. On focusing on a subformula, the system extracts and presents the contextual
information necessary for manipulating the subformula.

Our subformula representation is an extension of the Structured Calculational Proof format [?]. The
implementation details and the theoretical basis of the contextual extraction are given in [?].

Fig. 6 shows a snapshot of the formula transformations involved in the derivation of the binary search 
program. The derivation is displayed in a nested fashion. Whenever the user focuses on a subformula, an 
inner frame is created inside the outer frame. The assumptions available in each frame are displayed on 
the top of the frame. In the figure, as the user focuses on the consequent of the implication, the antecedent 
is added to the assumptions. On successful derivation of all the metavariables, the user can step out from the
formula mode to create a program where the metavariables are replaced with the corresponding derived 
expressions. 
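
The frame mechanism described above, where focusing on the consequent of an implication adds the antecedent to the assumptions, can be sketched as follows. The `Frame` class here is a simplified stand-in whose fields are assumptions; the actual contextual extraction rules in CAPS cover more cases than implication.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Frame:
    assumptions: Tuple[str, ...]
    focus: object              # an atom (str) or a tuple (op, left, right)


def step_into_consequent(frame):
    """Open an inner frame on Q of (P ⇒ Q); P becomes an extra assumption."""
    f = frame.focus
    if not (isinstance(f, tuple) and len(f) == 3 and f[0] == "⇒"):
        raise ValueError("focus is not an implication")
    _, antecedent, consequent = f
    return Frame(frame.assumptions + (antecedent,), consequent)


def step_out(outer, inner):
    """Close the inner frame, plugging its (transformed) focus back into the outer formula."""
    op, antecedent, _ = outer.focus
    return Frame(outer.assumptions, (op, antecedent, inner.focus))


outer = Frame(("A0",), ("⇒", "P", "Q"))
inner = step_into_consequent(outer)          # assumptions: A0, P; focus: Q
simplified = Frame(inner.assumptions, "Q'")  # result of transforming Q under A0 and P
result = step_out(outer, simplified)         # focus: P ⇒ Q'
```

The inner frame may use P freely while transforming Q, and stepping out restores the implication around the transformed consequent.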

Unlike the hierarchical program structure, the hierarchical formula structure is not usually shown in 
the GUI. This is done to reduce the clutter as the hierarchical formula structure can get very large. It 
is only displayed when we intend to select a subformula. This user interaction mode, called a selection 
mode, is used to select subformulas to be focused on. Fig. 4 shows a formula in the normal mode and in 
the selection mode. 

6 Selective Display of Information 

In the AnnotatedProgram representation, all the subprograms are annotated with the respective precondition
and postcondition. Although this creates a nice hierarchical structure, it results in a cluttered display
which places a higher cognitive demand on the attention and mental resources of the users. An effective
way to keep the cognitive load low is to hide information that is not relevant in the given context, such
as the annotations that can be easily inferred from the other annotations. CAPS provides a Minimal
Annotations mode which displays only the following annotations.

• Precondition and postcondition of the outermost program 

• Loop invariants 

• The intermediate-assertion of the Composition construct 

All other annotations can be inferred from these annotations without performing a textual substitution
required for computing the weakest precondition with respect to an assignment statement. Fig. 7
shows the Integer Division program with full annotations and with minimal annotations. All the hidden 
annotations can be easily inferred from the displayed annotations. The minimal annotations reduce the 
clutter to a great extent. 

In addition to the annotations, there are lots of other details that can be hidden. For example, the 
discharge status of various proof obligations for the SimplifyAuto tactic can run into several pages, and 
is hidden by default (see the ProofInfo link in Fig. 6). The annotated programs can also be collapsed by
double clicking on them. 

7 Maintaining Derivation History 

Invariant and assertion annotations help in understanding and verifying a program. However, they provide
little clue about how the program designer might have discovered them. For example, at node E
in the derivation in Fig. 1, we are unable to express the expression under consideration in terms of the
program variables. This guides us to introduce a fresh variable s and strengthen the invariant with P2.
This crucial information is missing from the final annotated program. It is therefore desirable to preserve
the complete derivation history to fully understand the derivation of the program. CAPS maintains the
derivation history in the form of a derivation tree. Maintaining history also facilitates backtracking and
branching if the user wants to try out an alternative derivation strategy.

Figure 7. Final AnnotatedProgram for the Integer Division problem: (a) full annotations mode, (b)
minimal annotations mode.

Backtracking and Branching. 

In CAPS, we do not allow programmers to directly edit the program; users have to backtrack and branch 
to try out different derivation strategies. This restriction ensures that the derivation tree contains all the 
information necessary to reconstruct the program from scratch. All the design decisions are manifest in 
the derivation tree which helps in understanding the rationale behind the introduction of various program 
constructs and invariants. Using the branching functionality, users can explore multiple solutions for the 
given programming task. 
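
The claim that the derivation tree suffices to reconstruct the program can be illustrated by replaying recorded tactics from the specification. The sketch below uses toy tactics over a dictionary; the names `replay`, `add_variable`, and `add_invariant` are hypothetical and much simpler than the CAPS tactics.

```python
from functools import reduce


def replay(spec, tactics):
    """Rebuild a program state by applying the recorded tactics, in order, to the specification."""
    return reduce(lambda state, tactic: tactic(state), tactics, spec)


# Toy tactics: each returns a new state and leaves its input untouched.
def add_variable(name):
    return lambda st: {**st, "vars": st["vars"] + [name]}


def add_invariant(name):
    return lambda st: {**st, "invariants": st["invariants"] + [name]}


spec = {"vars": ["r"], "invariants": []}

# Shared prefix of both branches (nodes A to C in Fig. 1).
prefix = [add_variable("n"), add_invariant("P0"), add_invariant("P1")]

first_attempt = replay(spec, prefix)
second_attempt = replay(spec, prefix + [add_variable("s"), add_invariant("P2")])
```

Branching then amounts to two tactic sequences that share a prefix: backtracking to node C and strengthening the invariant produces the second sequence without discarding the first.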

Navigating the Derivation tree 

The conventional tree interface is not suitable for showing the derivation tree. At any point during a
derivation, we are interested in only the active path of the derivation tree. This active path is shown in
the left panel in the GUI. To make it easy to navigate to other branches, we show siblings of the nodes
in the path. Users can navigate across the branches by clicking the sibling markers as shown in Fig. 8. If
there are multiple branches under the selected sibling, then the rightmost branch is selected.

Figure 8. Navigating the derivation tree: Fig. (a) shows a schematic diagram of a derivation tree. Fig. (b)
shows the path in the derivation tree containing the currently selected node (node 12). A marker (a filled
circle) to the right of node 12 indicates the presence of a right-sibling node (node 37) in the derivation
tree. Users can click on this sibling marker to switch to the branch containing node 37. The resulting
path is shown in Fig. (c).
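
The navigation rule can be sketched with a small tree structure. Nodes 12 and 37 are taken from Fig. 8; all other node numbers, the `Node` class, and the assumption that the displayed path always continues down the rightmost branch are illustrative choices, not a description of the CAPS implementation.

```python
class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def active_path(node):
    """Path from the root through `node`, continued below it along rightmost branches."""
    path = []
    cur = node
    while cur is not None:
        path.append(cur)
        cur = cur.parent
    path.reverse()
    while path[-1].children:
        path.append(path[-1].children[-1])
    return path


def right_sibling(node):
    """The sibling reached by clicking the marker to the right of `node`, if any."""
    if node.parent is None:
        return None
    siblings = node.parent.children
    i = siblings.index(node)
    return siblings[i + 1] if i + 1 < len(siblings) else None


# Hypothetical tree: 1 -> 2 -> {12 -> 13, 37 -> {38, 40}}
n1 = Node(1)
n2 = Node(2, n1)
n12 = Node(12, n2)
n13 = Node(13, n12)
n37 = Node(37, n2)
n38 = Node(38, n37)
n40 = Node(40, n37)

before = [n.label for n in active_path(n12)]                # [1, 2, 12, 13]
after = [n.label for n in active_path(right_sibling(n12))]  # [1, 2, 37, 40]
```

Clicking the sibling marker next to node 12 replaces the suffix of the path below node 2, and the rightmost child of node 37 is chosen because that node has more than one branch.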

8 Conclusions and Future Work 

In this work, we have described the design of an IDE for the Calculational Derivation of Imperative 
Programs. Our design focus has been on making the IDE accessible to nonexperts while retaining the 
overall flavor of the pen-and-paper style derivation. We have used the CAPS system in an elective course 
on Program Derivation taken by second-year students. The preliminary student response to the tool has
been very positive [?]. However, a thorough evaluation on more challenging problems is still needed.

Based on the lessons learned from the first offering of the tool, we plan to enhance the tool in a number of
ways.

Richer Language Constructs. 

We plan to target programs with richer constructs involving recursion, algebraic data types, and
polymorphic types.

Executing programs. 

We currently do not have functionality to execute the derived programs in CAPS. We plan to explore the
possibility of executing not only the final program, but also the intermediate partially derived programs.
The inability to simulate programs at the intermediate stages of behavioral abstraction has already been
identified [21] as one of the barriers to the adoption of stepwise-refinement-based methods.

Integrating Synthesis Solvers. 

We plan to employ synthesis solvers during the interactive derivation when the specification of the
subprogram under consideration falls in a theory for which a synthesis solver is available. We will, 
however, restrict the use of these solvers to the synthesis of loop-free programs. 